Chapter 5 Results
5.1 Work-Residence Data and Covid Case Growth
5.1.1 Choropleth Map showing Commuter-packed Counties and Covid Cases.
In our exploratory analysis, we found that the share of a county's workforce that commutes is positively correlated with Covid case growth. The following choropleth map shows some interesting patterns:
The choropleth shows the percentage of the labor force working outside the county of residence. The deeper the color of a county, the larger the share of its workforce that commutes outside the county. The teal circles represent the number of cumulative cases within each month of 2020. The general pattern is that the deeper the color, the more likely the corresponding county and its surroundings are to experience sharp growth in Covid numbers.
5.1.2 Time Series Comparison between Two Counties Covid Growth
Comparing the areas of the circles is not an intuitive way to see how the patterns differ. To make the difference clearer, we compare two counties marked on the map above that have very different commuter percentages (Warren at about 50% and Lewis at about 15%) and show their case growth below:
In this graph, the green line and the teal line represent the cumulative positives in Warren County and Lewis County, respectively. As the graph shows, Warren experiences more rapid growth in Covid numbers than Lewis does.
5.2 Income Data and Covid Case Growth
In our exploratory visualization, we attempt to uncover any pattern that links the poverty rate to Covid impact.
5.2.1 Scatterplot showing Covid Positive Rate and Poverty Rate
The graph below shows the Covid Positive Rate plotted against the Poverty Rate of each county. The positive rate is derived by dividing the cumulative positives as of Nov 28th by the total census population.

Surprisingly, the overall correlation is very weak (-0.03). However, certain outliers reveal some interesting information. The outliers labeled on the scatterplot are mostly counties in or near the New York metropolitan area, which are hit hardest due to the high population density within the area. If we look only at the counties within the New York metropolitan area, or only at the counties outside it, we observe a positive correlation between Poverty Rate and Cumulative Case counts.
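The effect described above, a near-zero correlation overall but a positive one within each group, can be reproduced with a minimal sketch. The county rows below are hypothetical values chosen only for illustration and are not the data behind the scatterplot. The names `counties`, `positive_rate`, and `correlation_for` are assumptions introduced here, and the sketch correlates the poverty rate with the positive rate rather than with raw case counts.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL rows, for illustration only:
# (county, in_metro_area, cumulative_positives, census_population, poverty_rate_pct)
counties = [
    ("A", True, 4200, 100_000, 10.0),
    ("B", True, 4800, 100_000, 14.0),
    ("C", True, 5500, 100_000, 18.0),
    ("D", False, 1000, 100_000, 12.0),
    ("E", False, 1300, 100_000, 16.0),
    ("F", False, 1700, 100_000, 20.0),
]


def positive_rate(positives, population):
    """Cumulative positives divided by total census population."""
    return positives / population


def correlation_for(rows):
    """Correlation between poverty rate and positive rate for the given rows."""
    poverty = [row[4] for row in rows]
    rate = [positive_rate(row[2], row[3]) for row in rows]
    return pearson(poverty, rate)


metro = [row for row in counties if row[1]]
non_metro = [row for row in counties if not row[1]]

r_all = correlation_for(counties)        # weak when both groups are pooled
r_metro = correlation_for(metro)         # strongly positive within the group
r_non_metro = correlation_for(non_metro)  # strongly positive within the group

print(f"all: {r_all:.2f}, metro: {r_metro:.2f}, non-metro: {r_non_metro:.2f}")
```

In this toy data the two groups sit at very different positive-rate levels, so pooling them hides the within-group trend. That is the same reason for examining metropolitan and non-metropolitan counties separately.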
5.3 Race Data and Covid Case Growth
In our investigation, we attempt to discover any correlation between race data and Covid impact in each county.
5.3.1 Biased Racial Data
The racial composition of a county differs drastically depending on whether or not the county is in the New York metropolitan area. The graph below indicates that if we include all the counties in our exploratory analysis, we probably won't find any meaningful results because of those outliers. For that reason, we separate the metropolitan area counties and their surrounding counties from the rest before attempting to find any correlation.

5.3.2 Scatterplots Showing Correlation Between Race and Covid Case Growth in/outside Metropolitan Areas

The displayed graph shows scatterplots of Covid case percentage against race distributions in metropolitan areas.

The displayed graph shows scatterplots of Covid case percentage against race distributions outside metropolitan areas.
Overall, we observe some patterns indicating that the higher the percentage of people of color, the higher the Covid impact seems to be. However, we immediately noticed that this correlation is an instance of the third-cause fallacy: high population density can lead to both a high Covid case percentage and a high percentage of people of color at the same time.
That said, there is some information we can obtain from the scatterplots. If we remove the outliers, which represent densely populated counties, the correlation is rather weak. This suggests that all counties in New York are affected by Covid regardless of their race distribution.
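The third-cause concern raised above can be illustrated with a partial correlation, which measures the association between two variables after removing the linear effect of a third. This is a minimal sketch and not the analysis performed in this chapter. All numbers are hypothetical and constructed so that population density drives both of the other variables, and the names `density`, `poc_share`, `case_rate`, and `partial_correlation` are assumptions introduced for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """First-order partial correlation of xs and ys, controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# HYPOTHETICAL county-level values, for illustration only.
# Both poc_share and case_rate are built as "density effect + unrelated noise".
density = [1, 2, 3, 4, 5, 6, 7, 8]               # arbitrary density units
poc_share = [18, 17, 22, 33, 38, 37, 42, 53]     # percent people of color
case_rate = [1.8, 1.7, 2.8, 2.7, 3.2, 4.3, 4.2, 5.3]  # percent Covid positive

raw_r = pearson(poc_share, case_rate)
partial_r = partial_correlation(poc_share, case_rate, density)

print(f"raw correlation: {raw_r:.2f}")
print(f"controlling for density: {partial_r:.2f}")
```

In this constructed example the raw correlation is strong, while the partial correlation is essentially zero once density is held fixed. Whether the real county data behaves the same way would need to be checked against actual density figures.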